Abstract

We conjecture that certain patterns (scars), theoretically and nu- 
merically predicted to be formed by electrons arranged on a sphere 
to minimize the repulsive Coulomb potential (the Thomson problem) 
and experimentally found in spherical crystals formed by self-assembled 
polystyrene beads (an instance of the generalized Thomson problem), 
could be relevant to extend the classic Caspar and Klug construction 
for icosahedrally-shaped virus capsids. The main idea is that scars
could be produced at an intermediate stage of the assembly of the
virus capsids and the release of the bending energy present in scars
into stretching energy could allow for a variety of non-spherical capsid
shapes. The conjecture can be tested in experiments on the assembly
of artificial protein cages, where these scars should appear.



* Invited talk by A.I. at the Fourth International Summer School and Work- 
shop on Nuclear Physics Methods and Accelerators in Biology and Medicine 
- Prague, 8-19 July 2007. 
PACS: 87.10.+e, 87.15.Kg, 61.72.-y 

Keywords: Virus structure, Biomembranes, Crystal Defects 



1 E-mail: iorio@ipnp.troja.mff.cuni.cz
2 E-mail: tcss@mahendra.iacs.res.in



1 Virus Structure 



General Considerations Viruses are small pieces of genetic material
(DNA or RNA) that can efficiently encode a few small identical proteins, which
then assemble themselves^3 to form "cages" around the genetic material [1].
These cages are called capsids, and their shape is the main concern of this
paper.

Capsids are essential for protecting the genetic material and contribute
to identifying cells suitable for the duplication of the genetic material^4.
Understanding the way proteins arrange to form these very resistant capsids is
important: if we could find a way to undo these constructions we would be
able to destroy viruses. There are three classes of capsid shapes [1]:
helical (the proteins spiral counter-clockwise around the genetic material),
icosahedral or simple (the proteins arrange in morphological units of 5 and 6
following precise geometrical and topological prescriptions, as we shall soon
explain), and complex (sphero-cylindrical, conical, tubular or even more
complicated shapes, i.e. without a precise resemblance to any particular regular
polyhedron). There are also polymorphic viruses that change their shape,
e.g., from icosahedral to tubular, and enveloped viruses that, in addition to
the protein capsid, also have an outer lipid bilayer (the viral envelope) taken
from the host cell membranes.

Icosahedral Viruses In 1956 Crick and Watson [2] proposed that small 
viruses have capsids with the identical proteins (or subunits or structural 
units) arranged into morphological units called capsomers with the shape 
of hexagons and pentagons, called hexamers and pentamers, respectively. 
These capsomers form polyhedrons that go under the name of icosadelta- 
hedrons, with a fixed number, 12, of pentamers and a variable number of 
hexamers. Following Crick and Watson's seminal idea, Caspar and Klug
(CK) [3] later extended the class of viruses to which this construction
applies to what they called "simple viruses", i.e. still roughly spherical but
not necessarily small viruses. The CK model for icosadeltahedral capsids is
nowadays universally accepted by virologists [1], [4].

The fact that exactly 12 pentamers are necessary is easily understood
if we look at this problem as the analogous problem of tiling a sphere with
pentagons and hexagons and we take into account the topological
properties of the sphere (Euler theorem, see, e.g., [5]). The precise number of
hexamers is not fixed by this argument and needs further assumptions
that we shall discuss in the next paragraph. The argument goes like this:

3 Sometimes, for large viruses, the assembly is done with the help of other proteins 
encoded to this end by the genetic material. The environment also plays an important 
role. 

4 Viruses are not able to duplicate without the help of the host cell, which is why their
living nature is debatable.
Figure 1: Planar hexagonal lattice of identical rigid proteins. The vector
A = ha + kb corresponds to h = 1 and k = 3.

Suppose that the ends of proteins only join three at a time. If $N_p$ is the
number of p-mers used to tile a sphere of unit radius, i.e. $N_5$ pentamers
and $N_6$ hexamers, the resulting polyhedron P has $V_P = \frac{1}{3} \sum_p p N_p$
vertices, $E_P = \frac{1}{2} \sum_p p N_p$ edges, and $F_P = \sum_p N_p$ faces, giving for
the Euler characteristic $\chi = V_P - E_P + F_P$ the following expression

$\sum_p (6 - p) N_p = 6 \chi = 12$ , (1)

since for a sphere $\chi = 2$. Explicitly Eq. (1) reads

$(6 - 5) N_5 + (6 - 6) N_6 = 12$ , (2)

hence to tile a sphere $N_5 = 12$ is required, but $N_6$ can be arbitrary. As said,
for virus capsids $N_6$ is not arbitrary but must be a specific number that we
shall soon obtain. For the mathematical problem of the tiling of the sphere
one might also imagine using heptagons. In that case the Euler formula (1)
gives

$N_5 - N_7 = 12$ . (3)

Thus, starting from the tiling of the sphere with exactly 12 pentagons (and
an arbitrary number of hexagons) one can add pentagon-heptagon pairs, but
not a pentagon or a heptagon separately. Note that at this point this is only
a mathematical consideration, and its relevance for virus structure remains to
be proved.
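
The counting argument above can be checked numerically. The following is a minimal sketch, assuming (as in the text) that three polygons meet at every vertex and that every edge is shared by two polygons; the function names and the example tilings are illustrative choices, not part of the original construction. It only checks the counts, not whether a tiling with those counts can actually be realized.

```python
from fractions import Fraction


def euler_characteristic(counts):
    """Return V - E + F for a tiling described by counts = {p: N_p}.

    Assumes three polygons meet at each vertex and each edge is shared
    by exactly two polygons.
    """
    corners = sum(p * n for p, n in counts.items())
    vertices = Fraction(corners, 3)  # every vertex is shared by 3 polygons
    edges = Fraction(corners, 2)     # every edge is shared by 2 polygons
    faces = sum(counts.values())
    return vertices - edges + faces


def topological_charge(counts):
    """Left-hand side of Eq. (1): sum over p of (6 - p) * N_p."""
    return sum((6 - p) * n for p, n in counts.items())


# Illustrative count sets (hypothetical examples):
dodecahedron = {5: 12}                 # 12 pentagons, no hexagons
soccer_ball = {5: 12, 6: 20}           # 12 pentagons, 20 hexagons
with_one_pair = {5: 13, 6: 18, 7: 1}   # two hexagons traded for a 5-7 pair

for tiling in (dodecahedron, soccer_ball, with_one_pair):
    print(tiling, euler_characteristic(tiling), topological_charge(tiling))
```

In all three cases the Euler characteristic is 2 and the sum in Eq. (1) is 12, while removing a single pentagon (e.g. `{5: 11, 6: 20}`) breaks both conditions.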

The geometric interpretation of the Euler formula (1) is that a sphere
of unit radius has curvature $R_{sphere} = +1$ and each polygon contributes to
this curvature with $R_p = (6 - p)/12$: a hexagon with $R_6 = 0$, a pentagon
with $R_5 = +1/12$, a heptagon with $R_7 = -1/12$. This can be understood by
constructing hexagons, pentagons and heptagons out of equilateral triangles
of paper. A hexagon is obtained by joining together 6 triangles: they all stay
in a plane. Take one triangle out and join what is left to make a pentagon
and the resulting figure will bend outwards, while adding one triangle to




Figure 2: The equilateral triangles template and the icosadeltahedron. The
10 circled points on the planar template correspond to the 10 inner vertices
of the solid, while all the outer vertices of the 5 upper triangles correspond
to the north pole vertex of the solid and all the outer vertices of the 5
lower triangles correspond to the south pole vertex. At these locations the
hexamers turn into pentamers. Each triangular face of the icosadeltahedron
is made of [T/2] hexamers (6 for the example of Fig. 1).

the hexagon to make a heptagon results in an inward bending. This also
tells us that a certain amount of bending energy $E_b$ is necessary to convert a
hexagon into a pentagon or into a heptagon. How large $E_b$ is depends on the
elastic properties of the material used. Let us now describe in more detail
the CK construction.
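
The paper-triangle picture of the curvature shares can be turned into a short calculation. This is a sketch under the assumption that the curvature carried by a p-gon is its angular deficit (360 degrees minus the angle filled by p equilateral triangles) divided by the total deficit of a closed surface with the topology of a sphere, which is 720 degrees; the function name is hypothetical.

```python
from fractions import Fraction


def curvature_share(p):
    """Share of the sphere's total curvature carried by a p-gon.

    p equilateral triangles around a point fill an angle of 60 * p degrees,
    so the angular deficit is 360 - 60 * p degrees. Dividing by the total
    deficit of a sphere-like closed surface (720 degrees) gives (6 - p) / 12.
    """
    deficit_degrees = 360 - 60 * p
    return Fraction(deficit_degrees, 720)


for p in (5, 6, 7):
    print(p, curvature_share(p))

# Twelve pentagons alone account for the full curvature of the sphere.
print(12 * curvature_share(5))
```

With this normalization a pentagon-heptagon pair carries zero net curvature, which is why such pairs can be added without violating Eq. (3).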

The CK construction Suppose that the proteins are arranged on a plane
to form the hexagonal lattice of Fig. 1. Each side of the lattice represents
a real protein. The basis vectors a and b, with |a| = |b|, join the center of
the hexagon taken as the origin of the lattice with the centers of the nearest
hexagons as in the figure. The angle between them is evidently $\varphi = 60^\circ$.
The 3-dimensional polyhedron these proteins will eventually form is obtained
by imagining the 20 equilateral triangles with side |A| = A (where A = ha + kb,
and h, k = 0, 1, 2, ...) represented in Fig. 2 folded to obtain the icosahedron, the
Platonic solid with 12 vertices, 20 faces and 30 edges. Each triangular face
of the icosahedron contains a fixed number of hexamers, made of the real
proteins. At each of the 12 vertices the hexamers must turn into pentamers
for the topological and geometrical reasons described above. Say |a| = a,
then one has $A^2 = a^2 (h^2 + k^2 + 2hk \cos\varphi) = a^2 (h^2 + k^2 + hk) = a^2 T(h,k)$,
with T(h,k) = 1, 3, 4, 7, .... Since the area of the triangle is
$\sigma_\Delta = (\sqrt{3}/4) a^2 T(h,k)$ and the area of one hexagon is
$\sigma_6 = (\sqrt{3}/2) a^2$, the number of hexagons per triangle is
$n_6 = \sigma_\Delta / \sigma_6 = [T/2]$. The total number of subunits
is obtained by counting the total number of hexagons used for the planar
lattice of Fig. 1, which is $N_6 = 20 (T/2) = 10T$, then multiplying by 6 (the
number of edges of the hexagon): $N_{proteins} = 60T$. On the real 3-dimensional
Virus                                     N_proteins   T
Feline Panleukopenia Virus                60           1
Human Hepatitis B                         240          4
Infectious Bursal Disease Virus (IBDV)    780          13
General                                   60T          h^2 + k^2 + hk

Table 1: Examples of viruses that follow the CK classification taken from
Ref. [4].

solid (the one obtained by folding the planar template)
the 60T proteins are arranged as: i) 60 form 12 pentamers; ii) 60(T - 1)
form 10(T - 1) hexamers, for a total number of morphological units of N =
10T + 2. The figures obtained are icosadeltahedrons characterized by the pair
of integers (h, k), which not only are related to the total number of proteins
but also give the "chirality" of the polyhedron. Viruses belonging to this
class follow these prescriptions with great accuracy and they are classified
according to the values of T (see Table 1 for some examples and Ref. [4] for an
exhaustive database on icosahedral virus structures). Recently there have
been various attempts to generalize the CK model to include also certain
complex viruses. One of those attempts is the model proposed in Ref. [6],
based on the continuum elastic theory of large spherical viruses of Ref. [7],
where the authors address the problem of understanding the formation
of spherocylindrical and conical virus capsids. Later we shall show that, if
a change in the texture of the arrangement of proteins (scar) takes place,
those and many more shapes could be obtained.
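
The CK bookkeeping described above is simple enough to tabulate. The sketch below assumes unit lattice spacing (a = 1) and uses hypothetical function names; the (h, k) pairs in the usage lines are chosen only because they reproduce the T values 1, 4 and 13 quoted in Table 1.

```python
import math


def t_number(h, k):
    """Triangulation number T(h, k) = h^2 + k^2 + hk."""
    return h * h + k * k + h * k


def lattice_vector_length_squared(h, k, a=1.0):
    """|A|^2 for A = h*a_vec + k*b_vec, with a 60 degree angle between them."""
    bx = a * math.cos(math.radians(60))
    by = a * math.sin(math.radians(60))
    x = h * a + k * bx
    y = k * by
    return x * x + y * y


def ck_counts(h, k):
    """Protein and capsomer counts of the (h, k) icosadeltahedron."""
    t = t_number(h, k)
    return {
        "T": t,
        "proteins": 60 * t,
        "pentamers": 12,
        "hexamers": 10 * (t - 1),
        "capsomers": 10 * t + 2,
    }


def allowed_t_numbers(t_max):
    """Sorted distinct values of T(h, k) not exceeding t_max (h = k = 0 excluded)."""
    bound = math.isqrt(t_max) + 1
    values = {
        t_number(h, k)
        for h in range(bound + 1)
        for k in range(bound + 1)
        if (h, k) != (0, 0) and t_number(h, k) <= t_max
    }
    return sorted(values)


for h, k in ((1, 0), (2, 0), (1, 3)):
    print((h, k), ck_counts(h, k))
print(allowed_t_numbers(13))
```

The pair (1, 3) is the one drawn in Fig. 1; the direct lattice computation in `lattice_vector_length_squared` agrees with the closed formula for T.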

2 Lessons from the Thomson Problem 

Thomson Problem Let us now turn our attention to a different but
geometrically related physical set-up, from which we would like to gain some
insights for the generalization of the CK construction we are looking for:
the Thomson problem [8]. It consists of determining the minimum energy
configuration for a collection of electrons constrained to move on the surface
of a sphere and interacting via the Coulomb potential. This old (and largely
unsolved) problem has many generalizations, both to more general repulsive
potentials and to topological defects rather than unit electric charges
[9], [10]. The fact that the two problems (virus capsid construction and
equilibrium configurations for charges on a sphere) are intimately related
can be seen from the numerical results for the Thomson problem that have
been obtained over the years. In Ref. [11] the authors proposed as a solution
of the problem an arrangement of N electrons on the sphere into a triangular
lattice where each electron has 6 nearest neighbors sitting at the vertices of
a hexagon, with the exception of 12 locations where the nearest neighbors
are only 5, sitting at the vertices of a pentagon, and N = 10T + 2, with
$T = h^2 + k^2 + hk$: that is the icosadeltahedron. Note that in this case the
electrons are constrained to be on the surface of the sphere, e.g. imagining
the sphere as a metal, while the proteins have no such constraint.
Furthermore, the polygons here are "imaginary", in the sense that only the vertices
are real particles, while the edges are not.
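
To make the quantity being minimized concrete, the sketch below evaluates the Coulomb energy (sum of 1/r over all pairs, unit charges, unit sphere) for the smallest icosadeltahedron, T = 1, i.e. N = 10T + 2 = 12 charges at the vertices of a regular icosahedron. It does not perform any minimization; the function names and the choice of units are assumptions made for illustration.

```python
import itertools
import math


def coulomb_energy(points):
    """Sum of 1/r over all distinct pairs of points (unit charges)."""
    return sum(1.0 / math.dist(p, q) for p, q in itertools.combinations(points, 2))


def icosahedron_vertices():
    """The 12 vertices of a regular icosahedron, scaled onto the unit sphere."""
    phi = (1 + math.sqrt(5)) / 2
    norm = math.sqrt(1 + phi * phi)
    raw = []
    for s1 in (1, -1):
        for s2 in (1, -1):
            raw.append((0.0, s1 * 1.0, s2 * phi))
            raw.append((s1 * 1.0, s2 * phi, 0.0))
            raw.append((s2 * phi, 0.0, s1 * 1.0))
    return [tuple(c / norm for c in v) for v in raw]


def nearest_neighbour_counts(points, tol=1e-9):
    """For each point, count how many other points sit at the minimum distance."""
    counts = []
    for i, p in enumerate(points):
        dists = [math.dist(p, q) for j, q in enumerate(points) if j != i]
        d_min = min(dists)
        counts.append(sum(1 for d in dists if d < d_min + tol))
    return counts


vertices = icosahedron_vertices()
print(len(vertices), coulomb_energy(vertices))
print(nearest_neighbour_counts(vertices))
```

For T = 1 every charge has exactly 5 nearest neighbors, so all 12 sites are "pentagon" sites and there are no 6-coordinated ones; the energy comes out close to 49.165 in these units.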

Scars (and Pentagonal Buttons) Further studies [12] have shown that,
even for N = 10T + 2 electrons, when T is large enough (of the order of
$10^2$), configurations that differ from the icosadeltahedron have lower
energy than the corresponding icosadeltahedron. That is, when near one of the
12 pentagons two hexagons (let us call this a 5-6-6 structure) are replaced by
a heptagon-pentagon pair (let us call this a 5-7-5 structure) to form a linear
pattern called a scar, the energy is lower than that of a configuration of 12
pentagons with all the rest hexagons. This indeed happens in numerical
simulations for larger and larger numbers of electrons, where the scars become
longer (e.g. 5-7-5-7-5, etc.), always respect the topological/geometrical
constraint of Eq. (3), can spiralize or might even form exotic patterns like two
nested pentagonal structures with five pentagons placed at the vertices of the 
outer pentagonal structure, five heptagons at the vertices of the inner pen- 
tagonal structure, and a pentagon in the common center (the vertex of the 
icosadeltahedron) (see, e.g., [S] and references therein). The latter patterns 
are called pentagonal buttons and an explanation of their topological origin 
can be found in Ref.[5]. Apparently, even more complicated structures can 
appear in numerical simulations [9]. Scars have been experimentally found 
to be formed in spherical crystals of mutually repelling polystyrene beads 
self-assembled on water droplets in oil [13]. The repulsive potential there is 
not the Coulomb potential, hence that is a particular instance of the gener- 
alized Thomson problem. These experimental findings confirm that, at least 
in the case of scars, things go along the lines of the above outlined analysis. 

The lesson we learn from the Thomson problem is that under certain
conditions it is energetically favorable to convert a 6-6 pair, with zero total
and local curvature (0 = 0 + 0) and zero bending energy, into a 5-7 pair,
again with zero total curvature but with nonzero local curvature (0 =
+1/12 - 1/12), hence with nonzero bending energy given by $2E_b$, where $E_b$
is the energy necessary to convert a 6 into a 5 or into a 7.

3 Scars and Virus Structure 

Our Conjecture What we propose here is that, due to the interaction 
with the environment (and/or with the genetic material), the formation of 
scars of pentamers and heptamers can take place in virus capsids during the 
process of assembly of the proteins. The way we believe this happens is as 
follows: i) At first the proteins assemble to make an icosadeltahedron
following the CK prescription. ii) At an intermediate stage, due to the interaction
with the environment, they form scars near the location of one or more of
the 12 pentamers at the vertices of the icosadeltahedron. This interaction is
necessary because the needed extra bending energy ($2E_b$ in the case of the
formation of what we might call a "simple" scar: 5-7-5) can only come from
the environment. iii) Eventually, the capsids change shape, from spherical
to non-spherical, via the release of the bending energy into stretching
energy at the location of the scar, with the consequent "annihilation" of the
5-7 pair into a 6-6 pair. The resulting capsid has the usual morphological
units, pentamers and hexamers, but not the spherical shape. Thus it is to
be expected that in real viruses scars should not be visible in the final stage
but they should drive a change in shape from spherical to non-spherical.
It is plausible, though, that i) in experiments where artificial virus capsids 
are synthesized, scars could be actually seen at an intermediate stage of the 
assembly when the "would-be-capsid" is frozen at a suitable point in time; 
ii) not all scars are annihilated, hence some of them could be visible on the 
final capsid. Note that in the presence of scars, the total number of proteins 
needed is the same as for the icosadeltahedral case without scars (this fol- 
lows from 6 + 6 = 5 + 7) while the number and type of morphological units 
changes (for one simple scar: 13 pentamers, 1 heptamer, 10T — 12 hexamers, 
etc.). 
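
The bookkeeping in the last sentences can be verified directly. The sketch below assumes that each simple (5-7-5) scar replaces two hexamers by one pentamer and one heptamer, and that each pentamer-heptamer pair costs a bending energy of 2 E_b, as stated in the text; the function names and the example value T = 13 are illustrative.

```python
def capsomer_counts(t, simple_scars=0):
    """Counts {p: N_p} of p-mers for triangulation number t with simple scars.

    Each simple scar trades two hexamers for one pentamer and one heptamer.
    """
    hexamers = 10 * (t - 1) - 2 * simple_scars
    if hexamers < 0:
        raise ValueError("not enough hexamers to host that many scars")
    return {5: 12 + simple_scars, 6: hexamers, 7: simple_scars}


def protein_count(counts):
    """Total number of proteins: a p-mer contains p proteins."""
    return sum(p * n for p, n in counts.items())


def topological_charge(counts):
    """Sum over p of (6 - p) * N_p, which must equal 12 on a sphere."""
    return sum((6 - p) * n for p, n in counts.items())


def scar_bending_energy(pairs, e_b=1.0):
    """Bending energy of a scar made of `pairs` pentamer-heptamer pairs."""
    return 2 * pairs * e_b


plain = capsomer_counts(13)
scarred = capsomer_counts(13, simple_scars=1)
print(plain, protein_count(plain), topological_charge(plain))
print(scarred, protein_count(scarred), topological_charge(scarred))
print(scar_bending_energy(1), scar_bending_energy(2))
```

Both configurations contain 60T = 780 proteins and satisfy the topological constraint, while the scarred one has 13 pentamers, 1 heptamer and 10T - 12 = 118 hexamers.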

As said earlier, there is a strong interest today in trying to generalize the 
CK construction to include non-spherical viruses, important examples being 
the retroviruses that have spherical, spherocylindrical and conical capsids 
(see, e.g., Ref. [14] and references therein). In the work of Ref. [6] the proposal
that spontaneous curvature of the proteins in the capsids can drive a change
in shape from spherical to spherocylindrical or conical shapes is extensively
studied and the geometric construction of certain capsids (spherocylindrical
and conical) is carried out. The application of that approach to the case
of retroviruses is then performed in Ref. [14], where the importance of the
environment for the assembly of retrovirus capsids is pointed out. What
we conjecture here is that the basis of these phenomena is the formation of
scars. Our belief is based on the following observations: i) Scars appear in
the geometrically related (generalized) Thomson problem; ii) Their
formation/annihilation mechanism seems to us a natural way to convert the
energy given by the environment into bending energy (formation) and
subsequently into stretching energy (annihilation); iii) This way a mechanism
for producing a great variety of shapes (not only the spherocylindrical or
conical) is in place: the formation/annihilation of scars (simple or complex)
in different locations on the intermediate icosadeltahedron (we suppose that
this has to happen near the vertices).

Other authors have speculated that scars should occur in virus capsids
[13]. They expect scars to be formed only on large viruses and that means

Figure 3: The intermediate spherical (icosadeltahedral) capsid with the C5-
symmetric distribution of simple scars.



that they are expecting scars to be seen on the final capsid. This is a
possibility that we do not exclude but that is not essential for us, as our main
proposal is to ascribe the shape change to the scar formation/annihilation
mechanism.

Variety of shapes It is easy to convince oneself that indeed a great variety
of shapes could be obtained via the scar formation/annihilation mechanism:
at the site on the intermediate icosadeltahedron where the scar is formed
and then annihilated the sphere gets stretched. The amount of stretching
depends on the complexity of the scar^5. The scars could be formed
symmetrically (as we shall see in the next paragraph, for a particular symmetry of
formation of scars we naturally obtain the spherocylindrical shape) or
asymmetrically, hence giving rise to regular or irregular shapes. Of this
very large number of shapes only a subset will describe real virus capsids,
because not all the shapes will be stable or energetically favored. A
systematic study can be carried out using this method, and case by case it could
be seen whether it fits with the elastic properties of the virus capsids and
with the constraints coming from the environment [14]. What we shall do
now is to construct, within our framework, one particular shape, the
spherocylindrical one. This will give us the opportunity to show how the method
of construction works in a case that is known to correspond to real virus
capsids, e.g. certain bacteriophages.

Spherocylindrical Capsids Suppose that the intermediate icosadeltahedron
is formed. We can then refer to the hexagonal lattice and to the
template of Fig. 1 and Fig. 2. Let us imagine that the scars, e.g. all simple,
are created only near the 10 inner vertices via a mechanism that respects

5 Complex scars might not be that rare, as the same amount of energy is needed for the
formation of, say, one next-to-simple scar (5-7-5-7-5) and two simple scars, i.e. $4E_b$.




Figure 4: The generalized CK construction of the template driven by the 
scars formation-annihilation mechanism. 

the C5 rotation symmetry^6 around the north pole-south pole axis^7. In Fig. 3
the vertices where the scars are formed are indicated with •, while the other
two are indicated with o. Take a pair of the equilateral triangles of that
template: any one from one of the outer layers of five triangles (e.g. the
layer of triangles that correspond to the north pole) and the one from the
inner layer that shares an edge with it. In Fig. 4 one of such pairs is shown and
the different nature of the vertices is represented as in Fig. 3. The scars are
distributed in a way that is asymmetric with respect to the two triangles,
hence the net effect of their formation/annihilation mechanism will deform
them differently. Depending on the actual orientation of the scar around the
given vertex the deformation will be different. To obtain the spherocylindrical
capsid the three scars should make the lower triangle thinner and
longer (they stretch the area and make it bigger) and this has the effect of
shrinking the upper triangle by making the common edge shorter. Due to
the symmetry of the location of scars the two edges of the new lower triangle
have to be the same. If this mechanism takes place in the same fashion
for all the ten pairs^8 of triangles of the template of Fig. 2, the resulting new
template is the one given in Fig. 4. We require that this mechanism is area
preserving, i.e. that the total number of proteins needed is the same as
the one needed for the icosadeltahedron; they are only rearranged. This is
obtained by requiring that $2\sigma_\Delta = \sigma_1 + \sigma_2$, where $\sigma_\Delta$ is the area of
one original equilateral triangle, $\sigma_1$ is the area of the upper
new triangle and $\sigma_2$ the area of the lower new triangle in Fig. 4. This means

6 $C_n$ is the finite group of rotations of angles $2\pi/n$, with n = 1, 2, 3, .... C5 is one
of the subgroups of the icosahedral group, the group of all possible symmetries of the
icosahedron. Its relevance for the Thomson problem has been understood in [5], where a
mechanism of spontaneous symmetry breaking is seen as responsible for some of the
patterns found in numerical simulations. Here our introduction of the C5 symmetry is
motivated solely by the need to build up the spherocylinder.

7 Of course the axis is completely arbitrary as long as it encompasses two opposite 
vertices. 

8 Five north pole triangles paired with their common-edge inner triangles and five south 
pole triangles paired with their common-edge inner triangles. 
Figure 5: The spherocylindrical capsid.

that the three quantities must be related as

$\sqrt{3} A^2 = B \left( \sqrt{A^2 - B^2/4} + \sqrt{C^2 - B^2/4} \right)$ , (4)

with B < A and C > A. Recall that, for a = 1, $A^2 = T = h^2 + k^2 + hk$,
hence the final capsid, obtained by folding the new template of Fig. 4 (see
Fig. 5), will have (12 pentamers and) the 10(T - 1) hexamers distributed
differently with respect to the intermediate icosadeltahedron.
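
The area-preserving requirement can be explored numerically. The sketch below rests on an explicit assumption about the geometry that is not spelled out in the surviving text: the new upper triangle is taken to be isosceles with sides (A, A, B) and the new lower triangle isosceles with sides (C, C, B), B being the shared edge. Under that assumption the condition that the two new areas add up to the area of two equilateral triangles of side A can be solved for C once A and B are given. The function names are hypothetical.

```python
import math


def isosceles_area(base, side):
    """Area of an isosceles triangle with the given base and two equal sides."""
    height = math.sqrt(side * side - base * base / 4)
    return 0.5 * base * height


def lower_edge_length(a_len, b_len):
    """Solve the area-preserving condition for C, given A and B <= A.

    Assumes the upper triangle has sides (A, A, B) and the lower one
    sides (C, C, B); both assumptions are labeled in the text above.
    """
    target = 2 * (math.sqrt(3) / 4) * a_len ** 2   # two equilateral triangles
    sigma_2 = target - isosceles_area(b_len, a_len)  # area left for the lower one
    height = 2 * sigma_2 / b_len
    return math.sqrt(height * height + b_len * b_len / 4)


a_len = math.sqrt(13)  # a = 1 and T = 13, so A^2 = T
for ratio in (1.0, 0.9, 0.8):
    b_len = ratio * a_len
    print(ratio, lower_edge_length(a_len, b_len) / a_len)
```

For B = A the function returns C = A (no deformation), and shortening the common edge B forces C to grow beyond A, which is the elongation that produces the spherocylinder in this picture.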

Notice that this spherocylinder is slightly different from the one obtained
in [6], as the upper and lower half-icosadeltahedrons are not obtained by
folding five equilateral triangles but five isosceles triangles (in this sense
they are no longer proper half-icosadeltahedrons but a deformation of them).
This is a feature that could be experimentally tested.

From this construction it is clear that a variety of shapes could be ob- 
tained this way. For instance, if the orientation of the scars in the previous 
setting is such that C shrinks, hence B becomes longer, then a disk-like 
shape is obtained. Let us stress here again that for this to correspond to 
real virus capsids one needs more detailed information on the elastic prop- 
erties of the proteins and of the interaction with the environment. 

4 Conclusions 

In this paper we propose a mechanism of formation and subsequent
annihilation of scars of pentamers-heptamers at an intermediate stage of the
assembly of the virus capsid as responsible for a great variety of
non-spherical virus shapes. Our conjecture is based on the fact that scars are
found in the (generalized) Thomson problem, in experiments and in
numerical simulations, and on the observation that this mechanism would give a
simple and plausible explanation of how the energy provided by the
environment is converted into a change of capsid shape. The conjecture can be
tested, for instance, in experiments where artificial capsids are synthesized.
Scars should appear on what we called here the intermediate icosadeltahedron,
then should drive the change in shape. Capsids that could perhaps be
used to this end are those of viruses that are known to have a
non-spherical final shape but still pentamers and hexamers as morphological
units, for instance certain bacteriophages. This conjecture, if
experimentally confirmed, would extend the classic Caspar and Klug construction
for icosahedral viruses to include viruses that still have pentamers and
hexamers as morphological units but are no longer icosadeltahedrons.

Let us conclude by making the remark that a better understanding of 
the way virus capsids are formed might suggest ways of destroying a virus 
by, for example, making the capsid unstable. 